In this work, we propose a meta-algorithm that solves multivariate global optimization problems by using univariate global optimizers. Although univariate global optimization has received less attention than the multivariate case, which is more heavily emphasized in both academia and industry, we show that it is still relevant and can be used directly to solve multivariate optimization problems. We also provide the corresponding regret bounds, expressed in terms of the average regret of the underlying univariate optimizer, together with robustness against nonnegative noise under strong regret guarantees.
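A minimal illustration of the idea above, i.e. driving a multivariate problem with only a univariate solver. Note that this sketch uses coordinate-wise sweeps with a golden-section search as the 1-D optimizer, which is a generic stand-in: the abstract does not specify the paper's actual reduction, so treat every detail here (the sweep order, the solver, the box domain) as an assumption for illustration only.

```python
import math

def golden_section(f, lo, hi, tol=1e-6):
    """Univariate minimizer on [lo, hi] via golden-section search
    (assumes the slice is unimodal; stand-in for any 1-D global optimizer)."""
    invphi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - invphi * (b - a)
        else:
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2

def coordinatewise_minimize(f, box, sweeps=20):
    """Minimize a multivariate f by repeatedly calling the 1-D solver
    along each coordinate over its interval in `box`."""
    x = [(lo + hi) / 2 for lo, hi in box]
    for _ in range(sweeps):
        for i, (lo, hi) in enumerate(box):
            def slice_f(t, i=i):
                y = list(x)
                y[i] = t          # vary only coordinate i
                return f(y)
            x[i] = golden_section(slice_f, lo, hi)
    return x

# Example: a separable quadratic over [-5, 5]^2, minimized at (1, -2).
xmin = coordinatewise_minimize(lambda v: (v[0] - 1) ** 2 + (v[1] + 2) ** 2,
                               [(-5, 5), (-5, 5)])
```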
We study prediction with expert advice, where the aim is to make decisions by combining the decisions produced by a set of experts (e.g., independently running algorithms). We achieve minimax optimal dynamic regret in the prediction with expert advice setting, i.e., we can compete in an optimal manner against time-varying (not necessarily fixed) combinations of the expert decisions. Our end-algorithm is truly online, requiring no prior information such as the time horizon or the loss ranges, which are commonly used by various algorithms in the literature. Both our regret guarantees and the min-max lower bounds are derived with the general setting in mind, i.e., the expert losses can have time-varying properties and may be unbounded. Our algorithms can be adapted to restrictive scenarios regarding the loss feedback and the decision making. Our guarantees are universal, i.e., our end-algorithm can provide regret guarantees against any competitor sequence in a minimax optimal manner with logarithmic complexity. Note that, to the best of our knowledge, for the prediction with expert advice problem, our algorithms are the first to produce such universally optimal, adaptive and truly online guarantees with no prior knowledge.
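For context on the setting above, here is the classic exponentially weighted average forecaster, the textbook baseline for prediction with expert advice. The abstract's algorithm is considerably more sophisticated (dynamic regret, no prior knowledge of horizon or loss ranges); this sketch, with an arbitrary fixed learning rate `eta`, only shows the basic mechanics of reweighting experts by their losses.

```python
import math

def exponential_weights(expert_losses, eta=0.5):
    """Maintain one weight per expert; after each round, multiply each
    weight by exp(-eta * loss) and renormalize. Returns the index of the
    heaviest expert at each round and the final weights."""
    n = len(expert_losses[0])
    w = [1.0 / n] * n
    picks = []
    for losses in expert_losses:           # one list of losses per round
        picks.append(w.index(max(w)))      # follow the currently heaviest expert
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
        s = sum(w)
        w = [wi / s for wi in w]
    return picks, w

# Expert 1 consistently incurs zero loss, so its weight should dominate.
picks, w = exponential_weights([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
```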
We investigate an autoregressive formulation for the problem of smoothing time series, obtained by manipulating the inherent objective function of traditional moving-mean smoothers. Not only do the autoregressive smoothers enforce a higher degree of smoothness, they are just as efficient as the traditional moving means and can be optimized accordingly with respect to the input dataset. Interestingly, the autoregressive model results in a moving mean with an exponentially tapered window.
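The exponentially tapered window mentioned above can be seen in the simplest first-order autoregressive smoother: unrolling the recursion s_t = (1 - alpha) s_{t-1} + alpha x_t gives weights alpha (1 - alpha)^k on past samples, i.e. an exponentially decaying moving mean. A minimal sketch (the value alpha = 0.3 is an illustrative choice, not the paper's):

```python
def ar_smooth(x, alpha=0.3):
    """First-order autoregressive smoother:
    s_t = (1 - alpha) * s_{t-1} + alpha * x_t.
    Equivalent to a moving mean with exponentially decaying weights."""
    s = [x[0]]
    for xt in x[1:]:
        s.append((1 - alpha) * s[-1] + alpha * xt)
    return s

# An alternating signal gets damped toward its running average.
smoothed = ar_smooth([0.0, 10.0, 0.0, 10.0, 0.0])
```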
We investigate optimal monotone transforms on the outputs of an estimator in order to improve its performance. We first study the traditional squared-error setting together with its weighted variant and show that the optimal monotone transform has the form of a unique staircase function. We further show that this staircase behavior is preserved for general strictly convex loss functions: their optimal monotone transform is also unique, i.e., there exists a single staircase transform that achieves the minimum loss. We propose a linear time and space algorithm that can find such optimal transforms for specific loss settings. Our algorithm has an online implementation, where the optimal transform for the samples observed so far can be found in linear space and amortized time when the samples arrive in an ordered fashion. We also extend our results to the case where the functions are not separately optimizable, and propose an algorithm with linear space and pseudo-linear time complexity for this setting.
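The staircase structure described above is familiar from least-squares isotonic regression, where the optimal nondecreasing fit is piecewise constant and is found by the Pool Adjacent Violators algorithm. The sketch below shows only this classic squared-error special case as an illustration of the staircase behavior; the paper's algorithms cover general strictly convex losses and an online variant, which this sketch does not.

```python
def pav(y):
    """Pool Adjacent Violators for least-squares isotonic regression.
    Returns the best nondecreasing fit to y, which is piecewise
    constant -- i.e., a staircase function."""
    blocks = []                       # each block: [sum_of_values, count]
    for v in y:
        blocks.append([v, 1])
        # merge while adjacent block means violate monotonicity
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)       # expand each block to its mean
    return out

fit = pav([1.0, 3.0, 2.0, 4.0])  # 3 and 2 violate monotonicity and get pooled
```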
We introduce an online convex optimization algorithm that utilizes projected subgradient descent with optimal adaptive learning rates. Our method provides second-order minimax dynamic regret guarantees (i.e., dependent on the sum of squared subgradient norms) for a sequence of general convex functions, which may not have strong convexity, smoothness, exp-concavity or even Lipschitz continuity. The regret guarantees hold against any comparator decision sequence with bounded path variation (i.e., the sum of the distances between successive decisions). We generate a lower bound on the worst-case second-order dynamic regret by incorporating the actual subgradient norms. We show that this lower bound matches our regret guarantee to within a constant factor, which makes our algorithm minimax optimal. We also derive an extension to per-coordinate decisions. We demonstrate how to best preserve our regret guarantees when the bound on the path variation of the comparator sequence grows in time or arrives partially over time. We further build on our algorithm to eliminate the need for any knowledge of the comparator path variation, and provide minimax optimal second-order regret guarantees with no prior information. Our approach can (universally) compete against all comparator sequences simultaneously in a minimax optimal manner, i.e., each regret guarantee depends on the respective comparator path variation. We discuss modifications of our approach that reduce its time, computation and memory complexity. We also further improve our results by making the regret guarantees additionally depend on the diameter of the comparator set, on top of the respective path variations.
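For a concrete feel of "second-order" adaptivity, here is a minimal projected online subgradient descent on an interval, with the standard step size eta_t proportional to 1/sqrt of the running sum of squared subgradients. The specific step-size constant and the 1-D domain are illustrative assumptions; the paper's algorithm and its dynamic-regret machinery are more general.

```python
import math

def online_pgd(grads, radius=1.0):
    """Projected online subgradient descent on [-radius, radius] with the
    adaptive step eta_t = radius / sqrt(sum of squared subgradients so far).
    Returns the iterate played at each round."""
    x, g2 = 0.0, 0.0
    xs = []
    for g in grads:
        xs.append(x)                       # play, then observe the subgradient
        g2 += g * g
        eta = radius / math.sqrt(g2) if g2 > 0 else 0.0
        x = x - eta * g
        x = max(-radius, min(radius, x))   # projection onto the interval
    return xs

iterates = online_pgd([1.0, 1.0, -1.0, 1.0])
```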
Long-term non-prehensile planar manipulation is a challenging task for robot planning and feedback control. It is characterized by underactuation, hybrid control, and contact uncertainty. One main difficulty lies in determining the contact points and directions, which involves joint logical and geometrical reasoning over the modes of the dynamics model. To tackle this issue, we propose a demonstration-guided hierarchical optimization framework for offline task and motion planning (TAMP). Our work extends the formulation of the dynamics model of the pusher-slider system to include the separation mode with face-switching cases, and solves a warm-started TAMP problem by exploiting human demonstrations. We show that our approach copes well with the local-minima problems present in state-of-the-art solvers and determines a valid solution to the task. We validate our results in simulation and demonstrate the applicability of the approach on a pusher-slider system with a real Franka Emika robot in the presence of external disturbances.
The classification loss functions used in deep neural network classifiers can be grouped into two categories based on whether they maximize the margin in Euclidean or angular spaces. Euclidean distances between sample vectors are used during classification for the methods maximizing the margin in Euclidean spaces, whereas the cosine similarity is used during the testing stage for the methods maximizing the margin in angular spaces. This paper introduces a novel classification loss that maximizes the margin in both the Euclidean and angular spaces at the same time. This way, the Euclidean and cosine distances produce similar, consistent results and complement each other, which in turn improves the accuracy. The proposed loss function enforces the samples of each class to cluster around the center that represents it. The centers approximating the classes are chosen from the boundary of a hypersphere, and the pairwise distances between class centers are all equal. This restriction corresponds to choosing the centers from the vertices of a regular simplex. There is no hyperparameter that must be set by the user in the proposed loss function, so the proposed method is extremely easy to apply to classical classification problems. Moreover, since the class samples are compactly clustered around their corresponding means, the proposed classifier is also well suited to open set recognition problems, where test samples can come from unknown classes not seen during the training phase. Experimental studies show that the proposed method achieves state-of-the-art accuracy on open set recognition despite its simplicity.
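One standard way to obtain the equidistant centers mentioned above is to take the n standard basis vectors of R^n, subtract their common mean, and normalize: the results lie on the unit hypersphere and form the vertices of a regular simplex, so all pairwise distances are equal. This is a generic construction for illustration; the paper may parameterize its centers differently (e.g. in a lower-dimensional embedding space).

```python
import math

def simplex_centers(n):
    """n pairwise-equidistant unit vectors in R^n: centered and
    normalized standard basis vectors (regular simplex vertices)."""
    centers = []
    for i in range(n):
        v = [(1.0 if j == i else 0.0) - 1.0 / n for j in range(n)]
        norm = math.sqrt(sum(c * c for c in v))
        centers.append([c / norm for c in v])
    return centers

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

C = simplex_centers(4)
d01, d12 = dist(C[0], C[1]), dist(C[1], C[2])  # all pairwise distances equal
```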
Recently, the focus of the computer vision community has shifted from expensive supervised learning towards self-supervised learning of visual representations. While the performance gap between supervised and self-supervised learning has been narrowing, the time needed to train self-supervised deep networks remains an order of magnitude larger than for their supervised counterparts, which hinders progress, imposes carbon costs, and limits societal benefits to institutions with substantial resources. Motivated by these issues, this paper investigates reducing the training time of recent self-supervised methods through various model-agnostic strategies that have not previously been applied to this problem. In particular, we study three strategies: an extendable cyclic learning rate schedule, a matching schedule of progressive augmentation magnitudes and image resolutions, and a hard positive mining strategy based on augmentation difficulty. We show that the three methods combined lead to up to a 2.7-fold speed-up in the training time of several self-supervised methods, while retaining performance comparable to the standard self-supervised learning setting.
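For a sense of what an extendable cyclic schedule looks like, here is one common variant: cosine annealing with restarts every fixed number of steps, so training can be extended by whole cycles. The exact schedule used in the paper may differ, and the cycle length and learning-rate bounds below are illustrative assumptions only.

```python
import math

def cyclic_lr(step, cycle_len=1000, lr_min=1e-4, lr_max=1e-1):
    """Cosine annealing from lr_max down to lr_min within each cycle,
    restarting to lr_max every cycle_len steps."""
    t = (step % cycle_len) / cycle_len        # position within current cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

start_lr = cyclic_lr(0)        # lr_max at the start of a cycle
mid_lr = cyclic_lr(500)        # decayed mid-cycle
restart_lr = cyclic_lr(1000)   # jumps back to lr_max at the restart
```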
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs. The large data sizes of graphs and their vertex features make scalable training algorithms and distributed memory systems necessary. Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges. We propose a highly parallel training algorithm that scales to large processor counts. In our solution, the large adjacency and vertex-feature matrices are partitioned among processors. We exploit the vertex-partitioning of the graph to use non-blocking point-to-point communication operations between processors for better scalability. To further minimize the parallelization overheads, we introduce a sparse matrix partitioning scheme based on a hypergraph partitioning model for full-batch training. We also propose a novel stochastic hypergraph model to encode the expected communication volume in mini-batch training. We show the merits of the hypergraph model, previously unexplored for GCN training, over the standard graph partitioning model, which does not accurately encode the communication costs. Experiments performed on real-world graph datasets demonstrate that the proposed algorithms achieve considerable speedups over alternative solutions. The savings in communication costs become even more pronounced at high processor counts. The performance benefits are preserved in deeper GCNs with more layers, as well as on billion-scale graphs.
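The key cost that hypergraph partitioning captures exactly, and plain graph partitioning only approximates, is the connectivity metric: a vertex's feature row must be sent once to every other part that owns one of its neighbors, regardless of how many cut edges connect them. A toy volume counter (this is only an illustration of that metric, not the paper's partitioner):

```python
def comm_volume(edges, part):
    """Communication volume of one sparse matrix-vector-like step under a
    vertex partition: for each vertex, count the *distinct* remote parts
    that need its feature row (the hypergraph connectivity metric)."""
    needed = {}                          # vertex -> set of parts needing it
    for u, v in edges:
        needed.setdefault(u, set()).add(part[v])
        needed.setdefault(v, set()).add(part[u])
    return sum(len(parts - {part[x]}) for x, parts in needed.items())

# A 4-vertex path split in the middle: vertices 1 and 2 each cross once.
edges = [(0, 1), (1, 2), (2, 3)]
vol_split = comm_volume(edges, {0: 0, 1: 0, 2: 1, 3: 1})
vol_single = comm_volume(edges, {0: 0, 1: 0, 2: 0, 3: 0})  # no communication
```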
Türkiye is located on a fault line, and earthquakes often occur there on both large and small scales. Effective solutions are needed for gathering up-to-date information during disasters. Social media can be used to gain insight into public opinion, and this insight can be used in public relations and disaster management. In this study, Twitter posts on the Izmir earthquake of October 2020 are analyzed. We ask whether this analysis can be used to make timely social inferences. Data mining and natural language processing (NLP) methods are used for this analysis: NLP is used for sentiment analysis and topic modelling. The Latent Dirichlet Allocation (LDA) algorithm is used for topic modelling, and the Bidirectional Encoder Representations from Transformers (BERT) model, built on the Transformer architecture, is used for sentiment analysis. We show that users shared goodwill wishes and aimed to contribute to the aid activities initiated after the earthquake, and that they wanted to make their voices heard by the competent institutions and organizations. The proposed methods work effectively. Future studies are also discussed.
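To show the shape of such a pipeline without the heavy models, here is a toy stand-in: the study itself uses BERT for sentiment and LDA for topics, while this sketch substitutes a hand-made word lexicon for sentiment scoring and a raw frequency count as a crude "topic" proxy. Every word list and example tweet below is invented for illustration.

```python
from collections import Counter

POSITIVE = {"hope", "wishes", "support", "help", "donate"}   # illustrative lexicon
NEGATIVE = {"damage", "trapped", "fear", "collapsed"}

def analyze(tweets):
    """Label each tweet by lexicon-based sentiment and return the most
    frequent words as a crude topic proxy (stand-in for BERT + LDA)."""
    sentiments, words = [], Counter()
    for t in tweets:
        toks = t.lower().split()
        score = sum(w in POSITIVE for w in toks) - sum(w in NEGATIVE for w in toks)
        sentiments.append("positive" if score > 0
                          else "negative" if score < 0 else "neutral")
        words.update(toks)
    return sentiments, words.most_common(3)

sents, top = analyze(["we donate to help izmir",
                      "buildings collapsed in izmir"])
```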